Multiple Predicate Learning for Document Image Understanding

نویسندگان

  • Floriana Esposito
  • Donato Malerba
  • Francesca A. Lisi
چکیده

Document image understanding denotes the recognition of semantically relevant components in the layout extracted from a document image. This recognition process is based on some visual models, whose manual specification can be a highly demanding task. In order to automatically acquire these models, we propose the application of machine learning techniques. In this paper, problems raised by possible dependencies between concepts to be learned are illustrated and solved with a computational strategy based on the separate-and-parallel-conquer search. The approach is tested on a set of real multi-page documents processed by the system WISDOM++. New results confirm the validity of the proposed strategy and show some limits of the machine learning system used in this work.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Learning Document Image Features With SqueezeNet Convolutional Neural Network

The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...

متن کامل

Similarity measurement for describe user images in social media

Online social networks like Instagram are places for communication. Also, these media produce rich metadata which are useful for further analysis in many fields including health and cognitive science. Many researchers are using these metadata like hashtags, images, etc. to detect patterns of user activities. However, there are several serious ambiguities like how much reliable are these informa...

متن کامل

Learning recursive theories with the separate-and-parallel- conquer strategy

Inductive learning of recursive logical theories from a set of examples is a complex task characterized by three important issues that do not arise in single predicate learning, namely the adoption of a generality order stronger than θsubsumption, the non-monotonicity of the consistency property, and the automated discovery of dependencies between target predicates. Some variations of the class...

متن کامل

Machine Learning for Reading Order Detection in Document Image Understanding

Document image understanding refers to logical and semantic analysis of document images in order to extract information understandable to humans and codify it into machine-readable form. Most of the studies on document image understanding have targeted the specific problem of associating layout components with logical labels, while less attention has been paid to the problem of extracting relat...

متن کامل

روش جدید متن‌کاوی برای استخراج اطلاعات زمینه کاربر به‌منظور بهبود رتبه‌بندی نتایج موتور جستجو

Today, the importance of text processing and its usages is well known among researchers and students. The amount of textual, documental materials increase day by day. So we need useful ways to save them and retrieve information from these materials. For example, search engines such as Google, Yahoo, Bing and etc. need to read so many web documents and retrieve the most similar ones to the user ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001